32 research outputs found

    Probabilistic Auto-Associative Models and Semi-Linear PCA

    Auto-Associative models cover a large class of methods used in data analysis. In this paper, we describe the general properties of these models when the projection component is linear, and we propose and test an easy-to-implement Probabilistic Semi-Linear Auto-Associative model in a Gaussian setting. We show that it is a generalization of the PCA model to the semi-linear case. Numerical experiments on simulated datasets and a real astronomical application highlight the interest of this approach.

    Auto-Associative models and generalized Principal Component Analysis

    In this communication, we propose auto-associative (AA) models to generalize Principal Component Analysis (PCA). AA models have been introduced in data analysis from a geometrical point of view. They are based on the approximation of the scatter plot of the observations by a differentiable manifold. They are interpreted as Projection Pursuit models adapted to the auto-associative case. Their theoretical properties are established and are shown to extend those of PCA. An iterative construction algorithm is proposed and its principle is illustrated on both simulated data and real data from image analysis.

    blockcluster, simerge and C++ with R


    Block clustering of Binary Data with Gaussian Co-variables

    The simultaneous grouping of rows and columns is an important technique that is increasingly used in large-scale data analysis. In this paper, we present a novel co-clustering method that uses co-variables in its construction. It is based on a latent block model that addresses the problem of grouping variables and clustering individuals by integrating the information given by sets of co-variables. Numerical experiments on simulated data sets and an application to real genetic data highlight the interest of this approach.

    Estimation of Parsimonious Covariance Models for Gaussian Matrix Valued Random Variables for Multi-Dimensional Spectroscopic Data

    Satellite remote sensing makes it possible to observe landscapes on large spatial scales. The Sentinel-1 and Sentinel-2 satellites currently provide full coverage of the national territory of France every 5 days. Due to the orbit of the satellites, coupled with the presence of clouds, the sampling of the pixels is temporally irregular. The project aims to develop, study and implement supervised and unsupervised classification methods for data that are of different natures (heterogeneous) and contain missing and/or aberrant values. The methods implemented are developed to process satellite and aerial data for ecology and cartography.

    Rmixmod: The R Package of the Model-Based Unsupervised, Supervised and Semi-Supervised Classification Mixmod Library

    Mixmod is a well-established software package for fitting a mixture model of multivariate Gaussian or multinomial probability distribution functions to a given data set for the purpose of clustering, density estimation or discriminant analysis. The Rmixmod S4 package provides a bridge between the C++ core library of Mixmod (mixmodLib) and the R statistical computing environment. In this article, we give an overview of the model-based clustering and classification methods, and we show how the R package Rmixmod can be used for clustering and discriminant analysis.